Data Integration System



(57) An enterprise integration system is coupled to
a number of legacy data sources. The data sources
each use different data formats and different access
methods. The integration system includes a back-end
interface configured to convert input data source
information to input XML documents and to convert output
XML documents to output data source information. A
front-end interface converts the output XML documents
to output HTML forms and the input HTML forms to the
XML documents. A middle tier includes a rules engine
and a rules database. Design tools are used to define
the conversion and the XML documents. A network
couples the back-end interface, the front-end interface, the
middle tier, the design tools, and the data sources.
Mobile agents are configured to communicate the XML
documents over the network and to process the XML
documents according to the rules.
Description

Field of the Invention 

[0001] This invention relates generally to computerized
applications, databases, and interfaces, and more
particularly to integrating applications, databases, and
interfaces having different formats, contexts, and
designs.

Background of the Invention 

[0002] Computer and computer-related technology 
have enabled the use of computers in numerous enter- 
prise functions. Almost every facet of a modern enter- 
prise is supported by computer systems in some man- 
ner. Computerization is a necessity to allow an enter- 
prise to remain functional and competitive in a constant- 
ly changing environment. 

[0003] Computer systems are used to automate proc- 
esses, to manage large quantities of information, and to 
provide fast and flexible communications. Many enter- 
prises, from sole proprietorships, small stores, profes- 
sional offices and partnerships, to large corporations 
have computerized their functions to some extent.
Computers are pervasive, not only in business environments,
but also in non-profit organizations, governments, and
educational institutions.

[0004] Computerized enterprise functions can include 
billing, order-taking, scheduling, inventory control, 
record keeping, and the like. Such computerization can 
be accomplished by using computer systems that run 
software packages. There are many application soft- 
ware packages available to handle a wide range of en- 
terprise functions, including those discussed above. 
[0005] One such package is the SAP R/2(™) System 
available from SAP America, Inc., 625 North Governor 
Printz Blvd., Essington, Pa. 19029. The SAP R/2 System
is a software package designed to run on IBM or
compatible mainframes in a CICS (Customer Information
Control System) or IMS (Information Management System)
environment. For example, SAP may use CICS to
interface with user terminals, printers, databases, or ex- 
ternal communication facilities such as IBM's Virtual Tel- 
ecommunications Access Method (VTAM). 
[0006] SAP is a modularized, table driven application 
software package that executes transactions to perform 
specified enterprise functions. These functions may in- 
clude order processing, inventory control, and invoice 
validation; financial accounting, planning, and related 
managerial control; production planning and control; 
and project accounting, planning, and control. The mod- 
ules that perform these functions are all fully integrated 
with one another. 

[0007] Another enterprise area that has been compu- 
terized is manufacturing. Numerous manufacturing 
functions are now controlled by computer systems. 
Such functions can include real-time process control of
discrete component manufacturing (such as in the
automobile industry), and process manufacturing (such as
chemical manufacturing through the use of real-time
process control systems). Directives communicated
from the computer systems to the manufacturing
operations are commonly known as work orders. Work
orders can include production orders, shipping orders,
receiving orders, and the like.

[0008] However, the computerization of different
functions within a single enterprise has usually followed
separate evolutionary paths. This results in incompatibility
between the different systems. For example,
transactions from a system for one function may have a context
and a format that are totally incompatible with the
context and format of another function. Furthermore, as
enterprises grow through mergers and acquisitions, the
likelihood of inheriting incompatible systems increases.
Consequently, the legacy systems cannot provide all the
information necessary for effective top level
management and control.

[0009] As an additional complexity, enterprise
systems need user interfaces for front-end operations. For
example, in the healthcare industry, administrative staff
and health care providers need reliable access to
patient records. If the healthcare enterprise has evolved
by a series of mergers, a reception desk populated with
half a dozen different terminals, each accessing a
different patient database and a different accounting
system, is a certainty, and service and profitability
suffer.

[0010] Generic computerized solutions that offer an
efficient, automated way to integrate an enterprise's
various computerized systems are difficult to implement.
Another conventional solution is to implement a custom,
computerized interface between the various systems.
However, these custom solutions are usually tailored to
a specific enterprise environment. As a result, the
tailored solutions are not portable into other situations
without major modifications. Additionally, these
solutions are costly to maintain over time because of
inherent difficulties in accommodating change.
[0011] Conventional solutions that meet all of the
needs for collecting, retrieving, and reporting data in a
complex enterprise do not exist. For example, the
DASS(™) system, available from SAP AG, of Walldorf,
Germany, is intended to automate manufacturing
functions. DASS receives information from the SAP R/2
package described above. However, DASS does not appear
to provide a generic solution to connect a computerized
business system to a computerized manufacturing
system.

[0012] Figure 1a shows an example legacy enterprise
system 10. The legacy system includes as subsystems
a SAP system 11, an Oracle(™) database 12, one or
more legacy applications 13, Lotus Notes(™) 14, a Web
server 15, and user interfaces 20. The system 10 might
also permit access to some functions by a mobile
computer (laptop) 30 via a dial-up communications link 40.
[0013] More than likely, the legacy system 10 will
exhibit one or more of the following problems. Not every
sub-system can communicate with every other sub-system
because each sub-system has its own application
programming interfaces (APIs). Real-time data interchange
among all of the sub-systems may be impossible or
extremely difficult because each sub-system stores and
views data in a different way and uses different
communication protocols. Modifying enterprise functions or
adding automation for new functions is expensive. Each
sub-system is developed with its own peculiar
programming language. Users cannot always access all the data
all of the time, particularly when the user is mobile. It is
difficult to provide top level management with an
abstraction of all system information.
[0014] What is needed is a system that can integrate 
various computer systems in an enterprise. The system 
needs to be able to convey transactional data between 
any number of databases regardless of their format, 
context, and access methodology. User interfaces to the 
databases need to be uniform. In addition, as enterprise 
functions change, new procedures and transactions 
must be accommodated in a minimal amount of time 
without having to redesign and reimplement any of the 
functional systems. The ideal enterprise integration sys- 
tem should be capable of adapting to any number of 
computerized functions in a modern complex enter- 
prise. 

Summary of the Invention 

[0015] The present invention is directed to a system 
and method for integrating computer systems found in 
many types of enterprises. 

[0016] An enterprise integration system is coupled to 
a number of legacy data sources. The data sources 
each use different data formats and different access 
methods. The integration system includes a back-end 
interface configured for converting input data source in- 
formation to input XML documents and for converting 
output XML documents to output data source informa- 
tion. 

[0017] A front-end interface converts the output XML
documents to output HTML forms and the input HTML 
forms to the XML documents. A middle tier includes a 
rules engine and a rules database. Design tools are 
used to define the conversion and the XML documents. 
[0018] A network couples the back-end interface, the
front-end interface, the middle tier, the design tools, and 
the data sources. Mobile agents are configured to com- 
municate the XML documents over the network and to 
process the XML documents according to the rules. 

Brief Description of the Drawings 

[0019]

Figure 1a is a block diagram of a legacy enterprise
system;

Figure 1b is a block diagram of an integrated
enterprise system according to the invention;

Figure 2 is a block diagram of design tools used by
the system of Figure 1b;

Figure 3 is a block diagram of XML data accesses
according to the invention;

Figure 4 is a block diagram of a back-end interface
of the system of Figure 1b;

Figure 5 is a diagram of a public interface of
the back-end interface of Figure 4;

Figure 6 is a block diagram of pooled connections;

Figure 7 is a flow diagram of a get request;

Figure 8 is a flow diagram of an update request; and

Figure 9 is a block diagram of an object of service
bridge objects.

Detailed Description of the Preferred Embodiments 

Introduction 

[0020] Our invention provides a robust and scalable
environment for integrating legacy enterprise computer
systems. The invention integrates databases,
transactions, and user interfaces having different formats,
contexts, and designs, such as the sub-systems shown in
Figure 1a. We also provide for automated rules-based
processing.

[0021] At the core of our integration system, we utilize
XML as a universal data encoding and interchange
format. XML (Extensible Markup Language) is a flexible
way for us to create common information formats and
share both the format and the data on the Internet, the
World Wide Web (WWW), intranets, and private local
area networks. XML, developed by the World Wide Web
Consortium (W3C), is "extensible" because, unlike
HyperText Markup Language (HTML), the markup symbols
of XML are unlimited and self-defining. XML is actually
a simpler and easier-to-use subset of the Standard
Generalized Markup Language (SGML), the standard for
how to create a document structure. XML enables us to
create customized "tags" that provide functionality not
available with HTML. For example, XML supports links
that point to multiple documents, as opposed to HTML
links, which can reference just one destination each.
These basic interfaces allow our integration system to
view, modify and interact with linked legacy applications
or legacy data sources.
System Architecture

[0022] As shown in Figure 1b, our enterprise integra- 
tion system 100 includes the following main compo- 
nents: a back-end interface 110, a front-end interface 
120, a middle tier 130, and design tools 140. The com- 
ponents are connected by a network and mobile agents 
101 carrying XML documents 102. The mobile agents 
101 are described in greater detail in U.S. Patent
Application Sn. 08/966,716, filed by Walsh on November 7,
1997, incorporated herein in its entirety by reference. As
a feature, the agents can travel according to itineraries, 
and agents can "meet" with each other at meeting points 
to interchange information. 

[0023] With our back-end interface 110, we enable 
read/write/modify access to existing (legacy) applica- 
tions and data sources 111. The back-end interface 
maps (or translates) data from legacy formats into the 
XML format used by our enterprise integration system 
100. 

[0024] The front-end interface 120 enables us to
present information to users 103 using standard pres- 
entation methodologies. The front-end interface also al- 
lows the user to modify information and to generate 
transactions to initiate enterprise processes or workflow. 
The front-end interface can be modified to meet chang- 
ing requirements of the enterprise. 
[0025] The middle tier 130 uses our mobile agents 
101 to provide an infrastructure for highly flexible, robust 
and scalable distributed applications. The middle tier
combines server technology with a customizable busi- 
ness rules engine and an application framework. The 
middle tier also provides for the deployment of discon- 
nected applications for mobile users. That is, the middle 
tier allows the mobile user to perform tasks while dis- 
connected from the system 100. 
[0026] The design tools 140 support the definition of 
XML document formats. The design tools also allow us 
to define mappings of the XML document formats and 
the legacy data formats, and to provide for the automat- 
ed generation of forms for user presentation via the 
front-end interface. These components are now de- 
scribed in greater detail. 

Back-End Interface 

[0027] The back-end interface 110 is composed of 
one or more service bridges 112. The service bridges 
provide highly efficient access to various legacy sys- 
tems. Hereinafter, we will frequently call the legacy sys- 
tems "data sources" 111. We do not care how the legacy 
systems are programmed, or how their applications are 
structured. That is, the back-end interface of our inte- 
gration system provides a generic and uniform access 
interface to the highly diverse legacy systems without 
requiring special knowledge of internal, legacy interfac- 
es of the linked systems. 

[0028] Semantically, we model the back-end interface
as an XML document publishing and management
system. We see the data source as "publishing" or "serving"
XML documents containing enterprise information. The
back-end allows users to add, update, delete, browse,
and search for documents in the data source. We chose
this semantic model of interaction because it provides
a generic interface through which many disparate
legacy systems can be accessed.

[0029] A particular data source 111 can manage
multiple types of documents, such as customer accounts,
purchase orders, work items, work lists, and the like.
Any document in any data source can be uniquely
identified and retrieved by a document identification (id) 104.
In our implementation, and keeping within the spirit of
XML, we use a document identification 104 that is
conceptually similar to a Web page Uniform Resource
Locator (URL), although different in detail. As shown, the
service bridges include a bridge framework (BF) 113
and a data source-specific runtime access component
(RAC) 114. The service bridge is described in greater
detail below with reference to Figures 4-9.

Bridge Framework 

[0030] The bridge framework 113 provides generic
high level access services for the back-end interface.
The framework is relatively independent from the
specifics of the linked legacy systems and is implemented
with reusable code. The bridge framework performs
user authentication, and identifies the user making a
request of the data source. The bridge framework also
identifies agents 101 making requests, and provides a
means to map a generic user identity to specific "logon"
information required by any of the legacy data sources,
e.g., a username and a password. The bridge
framework operates securely such that any sensitive
data-source logon information, such as a clear-text
password, is encrypted.

Connection Pooling and Document Management

[0031] The framework also manages objects involved
in establishing and maintaining a connection to the data
source, and provides for connection sharing and
pooling. Connection pooling and sharing is used when the
establishment of a connection or session with the data
source is too expensive to perform on a per user basis.
The connection pooling and sharing mechanism is
based on "user groups." All members of a user group
access a particular data source via a shared connection
pool. The connections in this pool are established within
the user context of a "pseudo-user account."
[0032] A pseudo-user account is a special data
source account that represents a group of users instead
of an individual user. Thus, if we have two user names,
"John@accounting" and "Jim@accounting," the two
accounting users both access the data source within the
context of the accounting pseudo-user account.
Connection pooling may not be necessary for all back-end
data sources, but certainly is required for relational
database access.
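
The group-based pooling described above can be illustrated with a minimal Python sketch. It is not the patented implementation: the class `GroupConnectionPool`, the helpers `group_of`, `connect_as_pseudo_user` and `pool_for`, and the use of an in-memory SQLite database as the "legacy" source are all assumptions made for illustration. Only the "name@group" convention comes from the text.

```python
import queue
import sqlite3

class GroupConnectionPool:
    """Shares a fixed set of connections among all users of one group."""

    def __init__(self, group, connect, size=2):
        self.group = group
        self._idle = queue.Queue()
        for _ in range(size):
            # Each connection is opened once, on behalf of the whole group.
            self._idle.put(connect(group))

    def acquire(self):
        return self._idle.get()  # blocks while every connection is in use

    def release(self, conn):
        self._idle.put(conn)

def group_of(user_name):
    """Extract the group part of a 'name@group' user name."""
    return user_name.split("@", 1)[1]

def connect_as_pseudo_user(group):
    # Stand-in for an expensive legacy logon under the group's pseudo-user account.
    return sqlite3.connect(":memory:")

pools = {}

def pool_for(user_name):
    """Return the pool shared by every member of this user's group."""
    group = group_of(user_name)
    if group not in pools:
        pools[group] = GroupConnectionPool(group, connect_as_pseudo_user)
    return pools[group]
```

With this sketch, `pool_for("John@accounting")` and `pool_for("Jim@accounting")` return the same pool object, so both users draw on the same small set of connections.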

Document Caching 

[0033] The bridge framework also provides a tunable 
caching facility to increase system performance. As 
stated above, a primary function of the back-end inter- 
face is to access legacy data and convert that data into 
the XML format. The bridge framework maintains XML 
documents in a cache 115 so that a subsequent request
to retrieve the same data can bypass any data access 
or conversion work overhead by accessing the cached 
XML document. 

[0034] The caching in our system is tunable. For a giv- 
en type of document, a system administrator can specify 
caching parameters 116 such as whether caching 
should be enabled, a maximum lifetime before cache 
entries become stale, a maximum cache size, and whether
the cache 115 should be persisted to disk and re-used
at the next server startup. For document types that contain
highly volatile data, caching can be disabled or cache 
entries can be set to expire quickly. For documents con- 
taining data that changes rarely, the caching parameters 
can be set aggressively to retain the documents in the 
cache. 
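
As a rough illustration of these tunable parameters, the following Python sketch implements an in-memory cache with an enable switch, a maximum entry lifetime, and a maximum size. The class name `DocumentCache` and its parameter names are hypothetical, and persistence to disk is deliberately left out.

```python
import time
from collections import OrderedDict

class DocumentCache:
    """Per-document-type cache of XML text, keyed by document id."""

    def __init__(self, enabled=True, max_age_seconds=60.0, max_entries=100,
                 clock=time.monotonic):
        self.enabled = enabled
        self.max_age = max_age_seconds
        self.max_entries = max_entries
        self._clock = clock
        self._entries = OrderedDict()  # doc_id -> (stored_at, xml_text)

    def get(self, doc_id):
        if not self.enabled:
            return None
        entry = self._entries.get(doc_id)
        if entry is None:
            return None
        stored_at, xml_text = entry
        if self._clock() - stored_at > self.max_age:
            del self._entries[doc_id]  # entry is stale, force a fresh fetch
            return None
        return xml_text

    def put(self, doc_id, xml_text):
        if not self.enabled:
            return
        self._entries[doc_id] = (self._clock(), xml_text)
        self._entries.move_to_end(doc_id)
        while len(self._entries) > self.max_entries:
            self._entries.popitem(last=False)  # evict the oldest entry
```

For volatile document types an administrator would construct the cache with `enabled=False` or a short `max_age_seconds`; for rarely changing documents, a long lifetime and a larger `max_entries`.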

Runtime Access Component 

[0035] The runtime access component (RAC) 114 is
specific to a particular data source 111. The RAC uses
application programming interfaces (APIs) and
structures of the legacy data source to access the data and
to map the data into the XML format. The exact
semantics of how the data are mapped to the XML format vary.
For example, the mapping can be for widely used legacy
databases and access interfaces, such as JDBC, JDBT,
SAP, or SQL. An example JDBC implementation is
described below with reference to Figure 4. The RAC
supports the following database access operations.

Query 

[0036] The "query" operation retrieves a document
from the data source. The caller supplies the id 104 of
the document to fetch. The bridge service returns the
specified information in the form of an XML document
according to one of the standard programming models
supported by W3C, for example, a DOM document
object or a SAX document. DOM (Document Object
Model) is a programming interface specification that
specifies a tree which applications may then explore or
modify. SAX is an event-based tool, more or less 'reading'
the document to the application using a set of named
methods to indicate document parts. SAX is typically
used where efficiency and low overhead are paramount,
while the DOM is used in cases where applications need
random access to a stable tree of elements. The
interface allows us to generate and modify XML documents
as full-fledged objects. Such documents are able to
have their contents and data "hidden" within the object,
helping us to ensure control over who can manipulate
the document. Document objects can carry
object-oriented procedures called methods.
[0037] In the case of a relational database, the query
operation maps to a SQL SELECT statement with a
WHERE clause specifying which record or records from
the database are contained in the document.
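
A minimal Python sketch of the query mapping follows, using SQLite in place of a JDBC source. The function name `query_document`, the `customer` table, and the convention that the document id is the row's `id` column are assumptions for illustration only.

```python
import sqlite3
import xml.etree.ElementTree as ET

def query_document(conn, table, doc_id):
    """Fetch one row by id and return it as an XML element, or None if absent."""
    # The table name is interpolated; a real bridge would take it from trusted
    # mapping metadata rather than from caller input.
    cur = conn.execute(f"SELECT * FROM {table} WHERE id = ?", (doc_id,))
    row = cur.fetchone()
    if row is None:
        return None
    columns = [entry[0] for entry in cur.description]
    doc = ET.Element(table)
    for column, value in zip(columns, row):
        ET.SubElement(doc, column).text = str(value)
    return doc

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer (id INTEGER PRIMARY KEY, firstname TEXT, lastname TEXT)"
)
conn.execute("INSERT INTO customer VALUES (1, 'John', 'Smith')")
xml_text = ET.tostring(query_document(conn, "customer", 1), encoding="unicode")
```

Each column of the selected record becomes one child element of the returned document, so the caller never sees the relational layout directly.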

Update 

[0038] The "update" operation modifies existing data
in the legacy data source. The caller supplies the id of
the document and an XML document containing only the
fields to be modified. In the case of the relational
database, the update operation maps to a SQL UPDATE
statement.
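
The partial-document update can be sketched in Python as follows. The function `update_document` and the `customer` table are hypothetical, and SQLite stands in for the relational data source; the sketch assumes each child element of the supplied document names one column.

```python
import sqlite3
import xml.etree.ElementTree as ET

def update_document(conn, table, doc_id, partial_xml):
    """Apply only the fields present in partial_xml to the row with this id."""
    fields = {child.tag: child.text for child in ET.fromstring(partial_xml)}
    # Column names come from XML tags, so reject anything that is not a
    # plain identifier before building the statement.
    if not fields or not all(name.isidentifier() for name in fields):
        raise ValueError("no usable fields in update document")
    assignments = ", ".join(f"{name} = ?" for name in fields)
    cur = conn.execute(
        f"UPDATE {table} SET {assignments} WHERE id = ?",
        (*fields.values(), doc_id),
    )
    return cur.rowcount  # number of rows changed

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer (id INTEGER PRIMARY KEY, firstname TEXT, lastname TEXT)"
)
conn.execute("INSERT INTO customer VALUES (1, 'John', 'Smith')")
changed = update_document(
    conn, "customer", 1, "<customer><lastname>Jones</lastname></customer>"
)
```

Fields that are absent from the supplied document, such as `firstname` here, are left untouched in the data source.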

Delete 

[0039] The "delete" operation removes a document
from the data source. The caller supplies the id of the
document to delete. In the case of the relational
database, the delete operation maps to a SQL DELETE
statement.

Add

[0040] The "add" operation inserts a new document
into the data source. The caller supplies the document
in the form of a DOM Document object. The bridge
service returns the id of the newly added document. In the
case of a relational database, the add operation maps
to a SQL INSERT INTO statement.
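
The add operation might look like the following Python sketch, which accepts a DOM document (via the standard `xml.dom.minidom` module) and returns the new row id. The function `add_document` is hypothetical, and the sketch assumes a flat document whose root element names the table and whose children each hold one text value.

```python
import sqlite3
from xml.dom.minidom import parseString

def add_document(conn, dom_document):
    """Insert a flat DOM document as one row and return the new row's id."""
    root = dom_document.documentElement
    fields = {
        node.tagName: node.firstChild.data
        for node in root.childNodes
        if node.nodeType == node.ELEMENT_NODE and node.firstChild is not None
    }
    # Element names become SQL identifiers, so accept plain identifiers only.
    if not all(name.isidentifier() for name in [root.tagName, *fields]):
        raise ValueError("unexpected element name")
    columns = ", ".join(fields)
    placeholders = ", ".join("?" for _ in fields)
    cur = conn.execute(
        f"INSERT INTO {root.tagName} ({columns}) VALUES ({placeholders})",
        tuple(fields.values()),
    )
    return cur.lastrowid

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer (id INTEGER PRIMARY KEY, firstname TEXT, lastname TEXT)"
)
dom = parseString(
    "<customer><firstname>Ann</firstname><lastname>Lee</lastname></customer>"
)
new_id = add_document(conn, dom)
```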

Browse 

[0041] The browse operation, also known as
"buffering," browses all of the documents in the data source of
a certain type. The caller supplies the type of document
to browse. The bridge service returns a browse object
similar to a JDBC result set. The browse object allows
the caller to traverse the results in either direction, to jump
to the first or last document, and to re-initiate the
browse operation. In the case of a relational database,
the browse operation maps to a SQL SELECT
statement that returns multiple records.
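
A browse object of this kind can be approximated in Python by buffering the rows of a multi-record SELECT and keeping a cursor position. The class `BrowseResult` and its method names are assumptions, loosely modeled on a scrollable result set; they are not taken from the source.

```python
import sqlite3

class BrowseResult:
    """Scrollable view over the rows returned by a multi-record SELECT."""

    def __init__(self, conn, table):
        self._conn, self._table = conn, table
        self.restart()

    def restart(self):
        # Re-run the SELECT so the browse reflects the current table contents.
        sql = f"SELECT * FROM {self._table} ORDER BY id"
        self._rows = self._conn.execute(sql).fetchall()
        self._pos = -1  # positioned before the first row

    def _at(self, pos):
        if 0 <= pos < len(self._rows):
            self._pos = pos
            return self._rows[pos]
        return None  # out of range: the position is left unchanged

    def next(self):
        return self._at(self._pos + 1)

    def previous(self):
        return self._at(self._pos - 1)

    def first(self):
        return self._at(0)

    def last(self):
        return self._at(len(self._rows) - 1)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, lastname TEXT)")
conn.executemany(
    "INSERT INTO customer VALUES (?, ?)", [(1, "Smith"), (2, "Jones"), (3, "Lee")]
)
browse = BrowseResult(conn, "customer")
```

Buffering every row up front keeps the sketch short; a production bridge would more likely page through results.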

Search 

[0042] The search operation browses the data source
for all documents of a certain type that meet predefined
search criteria. The search criteria can be a list of
fields and values which the caller wants to match against
records in the database. For example, the caller might
request all customer records that contain a "state" field
matching the string "MA." The caller supplies the type
of document to browse as well as a document containing
the fields to be matched. The bridge service returns a
browse object as above. In the case of a relational
database, the search operation maps to a SQL SELECT
statement in which the WHERE clause contains the
LIKE operator.
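
The "state matches MA" example can be sketched in Python as below. The function `search_documents` is hypothetical, SQLite stands in for the relational source, and for brevity the sketch returns a plain list of rows rather than a browse object.

```python
import sqlite3
import xml.etree.ElementTree as ET

def search_documents(conn, table, criteria_xml):
    """Return rows whose columns match every field in the criteria document."""
    criteria = {child.tag: child.text for child in ET.fromstring(criteria_xml)}
    # Field names become column names, so accept plain identifiers only.
    if not criteria or not all(name.isidentifier() for name in criteria):
        raise ValueError("no usable search fields")
    where = " AND ".join(f"{name} LIKE ?" for name in criteria)
    sql = f"SELECT * FROM {table} WHERE {where} ORDER BY id"
    return conn.execute(sql, tuple(criteria.values())).fetchall()

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer (id INTEGER PRIMARY KEY, lastname TEXT, state TEXT)"
)
conn.executemany(
    "INSERT INTO customer VALUES (?, ?, ?)",
    [(1, "Smith", "MA"), (2, "Jones", "NY"), (3, "Lee", "MA")],
)
in_ma = search_documents(conn, "customer", "<customer><state>MA</state></customer>")
```

Because the values are passed to LIKE, a caller can also supply SQL wildcards, for example `<lastname>S%</lastname>` to match last names beginning with "S".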

Front-End Interface 

[0043] The front-end interface 120 is responsible for 
user presentation and interaction. The front-end inter- 
face uses "forms" to allow users to view and modify in- 
formation. As an advantage, the front-end interface pro- 
vides a "thin" user interface, with simple interactivity that 
can easily be customized as the environment in the
enterprise changes. The front-end forms use HTML 121,
HTTP 122, JavaScript, Java servlets 123, Java applets,
and plug-ins as necessary. Because the front-end is Web
based, the user 103 can use any standard browser 124 to
interact with the system from anywhere there is an
Internet access point.

HTTP Communications 

[0044] HTTP is used as the communication
mechanism between agents and users. The user 103
browses and modifies information, and initiates
processes via the web browser 124. User requests are
routed to agents 101 via HTTP and through the Java servlet.
The servlet 123 in turn communicates with a front-end
service bridge 125 that serves as an interface for the
agents 101.

[0045] The servlet/service bridge combination
123/125 supports the establishment of user sessions
that are the channel for two-way communication
between the user and the agents. Within the context of a
session, the user can send HTTP GET or POST
requests to the agents, and the agents process such
requests and send back an HTTP response. Sessions
allow the user to wait for an agent to arrive and allow an
agent to wait for a user to connect.

HTML Form Style Sheets 

[0046] We accomplish the display of information to
users with HTML, web pages, and web forms. As stated
above, the information that agents retrieve from data
sources is in the form of the XML documents 102. To
format the XML documents into a form suitable for
users, the front-end servlet 123 converts the XML
document to an HTML page using a style sheet 126, e.g., XSL,
JSP or some other data replacement technique as
described below. The result of this conversion is the HTML
page containing the information in a user-friendly
format. By applying the style sheet, the servlet recognizes
and replaces certain data from the XML document and
converts the data to HTML form.
[0047] For example, a particular XML document 102
includes the following information:

    <customer>
      <firstname>John</firstname>
      <lastname>Smith</lastname>
    </customer>

[0048] The HTML style sheet 126 for this document
is as follows:

    <html>
      <h1>'customer.firstname'</h1>
      <h2>'customer.lastname'</h2>
    </html>

[0049] After applying the style sheet to the XML
document, the resultant HTML form 121 would appear as:

    <html>
      <h1>John</h1>
      <h2>Smith</h2>
    </html>
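
The data replacement step in this example can be sketched in a few lines of Python. This is only one possible reading of the quoted-placeholder convention; the function `apply_style_sheet` and the regular expression are assumptions, not the servlet's actual logic.

```python
import re
import xml.etree.ElementTree as ET

# A placeholder is a single-quoted dotted path such as 'customer.firstname'.
PLACEHOLDER = re.compile(r"'([A-Za-z_]\w*(?:\.[A-Za-z_]\w*)*)'")

def apply_style_sheet(style_sheet, xml_text):
    """Replace each quoted path in the style sheet with the matching XML text."""
    root = ET.fromstring(xml_text)

    def resolve(match):
        head, *rest = match.group(1).split(".")
        if head != root.tag:
            return match.group(0)  # not a path into this document
        node = root
        for name in rest:
            node = node.find(name)
            if node is None:
                return match.group(0)  # unknown element: leave text as-is
        return node.text or ""

    return PLACEHOLDER.sub(resolve, style_sheet)

style = "<html><h1>'customer.firstname'</h1><h2>'customer.lastname'</h2></html>"
document = "<customer><firstname>John</firstname><lastname>Smith</lastname></customer>"
html = apply_style_sheet(style, document)
```

Placeholders that do not resolve are left in the output unchanged, which makes mistakes in a style sheet easy to spot in the rendered page.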

[0050] The style sheet supports accessing all of the
elements and attributes in the XML documents, and
iteration over groups of repeating elements.
[0051] For example, an XML document contains:

    <customer type="preferred">
      <firstname>John</firstname>
      <lastname>Smith</lastname>
    </customer>

[0052] The "type" attribute of the customer is
accessed by using a syntax such as the following:

    'customer.attr[type]'

which yields the value "preferred." Given a document
containing repeating groups as follows:

    <customers>
      <customer type="preferred">
        <lastname>Smith</lastname>
      </customer>
      <customer type="standard">
        <lastname>Jones</lastname>
      </customer>
    </customers>

[0053] The "lastname" element of the second
customer is accessed using a syntax such as
'customer[1].lastname' which yields the value "Jones."
To iterate over all of the customers and access their
"type" attributes, an expression such as:

    'iterate(i=customers.customer) {
        i.attr[type]
    }'

can be used to produce first the string "preferred," and
then "standard."
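
The attribute, index, and iteration expressions can be approximated by the Python sketch below. It does not parse the full `iterate(...) { ... }` syntax; instead `iterate` takes the repeating path and the inner expression as separate arguments. Both function names, and the rule that a leading step naming the root element refers to the root itself, are assumptions made to reconcile the examples in the text.

```python
import re
import xml.etree.ElementTree as ET

STEP = re.compile(r"(\w+)(?:\[(\w+)\])?")  # name, optionally followed by [arg]

def evaluate(expression, root):
    """Resolve a path such as 'customer[1].lastname' or 'customer.attr[type]'."""
    node = root
    for position, step in enumerate(expression.split(".")):
        name, arg = STEP.fullmatch(step).groups()
        if name == "attr":
            return node.get(arg)  # attr[type] yields the attribute value
        if position == 0 and name == root.tag and arg is None:
            continue  # a leading step may name the root element itself
        matches = node.findall(name)
        node = matches[int(arg) if arg else 0]  # zero-based: [1] is the second
    return node.text

def iterate(path, root, inner):
    """Evaluate the inner expression against every element matched by path."""
    *parents, last = path.split(".")
    node = root
    for position, name in enumerate(parents):
        if position == 0 and name == root.tag:
            continue
        node = node.find(name)
    return [evaluate(inner, item) for item in node.findall(last)]

single = ET.fromstring('<customer type="preferred"><lastname>Smith</lastname></customer>')
group = ET.fromstring(
    '<customers>'
    '<customer type="preferred"><lastname>Smith</lastname></customer>'
    '<customer type="standard"><lastname>Jones</lastname></customer>'
    '</customers>'
)
types = iterate("customers.customer", group, "attr[type]")
```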

Validation 

[0054] The front-end interface also supports the
validation of user entered information. Field validation
supplies some immediate feedback and interactivity
to the user. Field validation also increases
application efficiency by detecting common errors within
the web browser process before any other network
traffic is incurred or application logic is executed. Client side
validation can be broken down into two related levels.
Field-Level

[0055] Field-level validation performs simple checks
on user entered data to validate that the information is
of the correct format or data type. For example,
field-level validation can validate that a user enters numeric
values in a particular field, or uses a proper date format.
We implement field-level validations with JavaScript. A
library of common validations is supplied as a script file
on a web server. The library has a ".js" file extension.
This script file can be included into HTML forms as
desired using the <script> HTML tag. Validation is enabled
for a field by indicating the name of an appropriate
validation routine within an event handler of the field,
e.g., "onChange." The event handler is triggered when an
INPUT field changes. Setting up validation for a field
requires HTML coding as follows:

    <input type="text" name="birthdate"
           onChange="validateDate(birthdate)">

The validation library provides routines for common data
types such as dates, times, currency, etc. The validation
library can also provide a pattern matching ability
allowing user input to be matched against arbitrary patterns,
e.g., a pattern $##.## to match a monetary amount.
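
The pattern matching idea can be shown with a short Python sketch (the library described in the text is JavaScript; Python is used here only for illustration). The source does not define the pattern language, so the sketch assumes that "#" stands for exactly one digit and that every other character must match literally. The function names are hypothetical.

```python
import re

def pattern_to_regex(pattern):
    """Translate a mask such as '$##.##' into a compiled regular expression."""
    # '#' is assumed to mean one digit; all other characters are literals.
    parts = [r"\d" if ch == "#" else re.escape(ch) for ch in pattern]
    return re.compile("".join(parts))

def matches(pattern, value):
    """Return True when the whole input value conforms to the mask."""
    return pattern_to_regex(pattern).fullmatch(value) is not None
```

Under these assumptions `matches("$##.##", "$12.50")` accepts the input, while `"$1.50"` and `"12.50"` are rejected because they do not fit the mask exactly.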

Cross-Field Validation 

[0056] Cross-field validation allows for more complex 
validations. In this type of validation, the contents of one 
field depend on the contents of another field. For example,
cross-field validation can detect a situation
where a telephone number must be entered. Such val- 
idation usually requires a more detailed knowledge of 
the requirements of the application. 

Middle Tier 

[0057] The middle tier 130 provides the "glue" that 
links the back-end and the front-end interfaces. The 
middle tier utilizes the mobile agents 101 to communi- 
cate with the interfaces. The middle tier also provides 
support for disconnected applications and users. In
addition, the middle tier customizes the system 100 to the
needs of specific enterprise functions without actually
having to reprogram the legacy systems.
[0058] The middle tier supports the automation of 
complex workflow and complex validations of data that 
may require access to multiple data sources. As a
feature, the middle tier uses a rules engine (RE) 131
operating on rules stored in a database 132. The rules are
defined in a rules language, and can be retrieved by the 
agents 101 as needed. 

[0059] In a typical scenario, the user launches an
agent 105 due to interaction with the browser 124. The
agent carries an XML document, e.g., a purchase order
106, to the rules database 132. The agent retrieves the
appropriate rule for processing the order, such as a
purchase order workflow. The agent then interprets the rule
to appropriately route the document to the locations in
the network specified by the rule. The rule can include
a travel itinerary, as well as instructions on how to
interact with the data sources.
[0060] As an advantage, the operation of our system
is always current. As rules change so does the operation
of the system. The agents always execute according to the
current state of the rules database.
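
The rule-driven routing described in this section can be illustrated with a deliberately simplified Python sketch. The rules language is not specified in the text, so a rule is modeled here as a plain itinerary of (location, action) pairs keyed by document type; `rules_db`, `run_agent`, and the location and action names are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical rules database: document type -> itinerary of (location, action).
rules_db = {
    "purchase_order": [("inventory", "reserve"), ("accounting", "invoice")],
}

def run_agent(xml_text, rules, visit):
    """Route a document along the itinerary given by the current rule."""
    doc = ET.fromstring(xml_text)
    # The rule is looked up at launch, so edits to the rules database
    # take effect for the next agent without any code change.
    itinerary = list(rules[doc.tag])
    for location, action in itinerary:
        visit(location, action, doc)
    return itinerary

log = []
order = "<purchase_order><item>widget</item></purchase_order>"

# First agent follows the original two-stop workflow.
run_agent(order, rules_db, lambda loc, act, doc: log.append((loc, act)))

# The workflow rule changes; the next agent picks up the new first stop.
rules_db["purchase_order"].insert(0, ("approval", "sign_off"))
run_agent(order, rules_db, lambda loc, act, doc: log.append((loc, act)))
```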

Design Tools

[0061] As shown in Figure 2, the primary purpose of
the design tools 140 is to generate 141 XML document
type definitions (DTD) 142, to specify 143 data
mappings, i.e., RACs 114, to encode 144 rules 132, and to
design 145 user interfaces 126.

Document Type Definitions 

[0062] The step 141 identifies the different types of
document information (DTD) 142 that need to be
shared by the various data sources 111 of the back-end
110 and the browser 124 of the front-end 120. This
information is specified in the DTDs. For example, to
share purchase order information between systems, the
type of information needed in a purchase order needs
to be identified, then that information needs to be
encoded in a corresponding DTD. In one embodiment, the
design tools use the service bridge to extract schemas
from the data sources.

Data Mapping 

[0063] After a data source independent data format
has been generated, the mappings between the XML
format and legacy formats for a particular database
need to be specified as shown in Figure 3. A query
operation to a relational database 111 involves extracting
the schema of the database by generating a SQL runtime
access component (RAC) 114 which makes the JDBC
calls to the database, converting the resulting data
into the XML format, and handing the XML document
113 to an agent 101. The access components can be
implemented as Java code. The agent delivers the XML
to the front-end 120 for conversion to the HTML form
121 using the style sheet 126 so that the data can be
viewed by the user 103 using a standard browser 124.
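
The conversion of query results into the XML format can be sketched in Java as follows. This is a minimal illustration only: the class name RowToXml and its methods are hypothetical, and the row is passed in as a map of column names to values, whereas a real access component would read the values from a JDBC result set.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Sketch: convert one database row, given as column name to value, into an XML element. */
final class RowToXml {
    /** Escapes the characters that may not appear literally in XML text. */
    static String escape(String text) {
        return text.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    /** Emits one child element per column, in the iteration order of the map. */
    static String toXml(String elementName, Map<String, String> row) {
        StringBuilder xml = new StringBuilder();
        xml.append('<').append(elementName).append(">\n");
        for (Map.Entry<String, String> column : row.entrySet()) {
            xml.append("  <").append(column.getKey()).append('>')
               .append(escape(column.getValue()))
               .append("</").append(column.getKey()).append(">\n");
        }
        return xml.append("</").append(elementName).append('>').toString();
    }

    public static void main(String[] args) {
        Map<String, String> row = new LinkedHashMap<>();  // keeps column order
        row.put("ID", "1");
        row.put("name", "Smith & Sons");
        System.out.println(toXml("customer", row));
    }
}
```

The ampersand is escaped first so that the entities produced for "<" and ">" are not escaped a second time.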
[0064] Conversely, the update operation converts the
HTML form to the corresponding XML document. The
XML document is converted to a legacy format and the
RAC modifies the data source using its schema. For other
legacy data sources that are not specified by a schema
or some other metadata, the mapping may need to
be done by means that access the APIs directly.


Rule Encoding 

[0065] After the data format definition is generated,
and the RAC has been specified to access the appropriate
data source, the next step is to encode what
agents are going to do with the information. In a simple 
data replication system, an agent may retrieve modified 
records from a master database, travel to the location 
of a backup database, and then update the backup da- 
tabase with a copy of the modified record. This process 
involves the encoding of a specific rule. 

Designing the User Interface 

[0066] As shown in Figure 2, generating the user in- 
terface requires three steps: manipulating document 
type definitions (DTD) 145, importing DTD 146, and 
generating DTD from database schema 147. 

Authoring DTD 

[0067] The design tools 140 allow the system designer
to define, design, and manipulate XML and HTML
DTDs. A DTD 142 defines the names of the document
elements, the content model of each element,
how often and in which order elements can appear,
whether start or end tags can be omitted, the possible
presence of attributes and their default values, and the
names of the entities.

[0068] Because the DTDs represent many different 
types of documents in the system, this step essentially 
defines the data types of the enterprise's computerized 
applications. As an advantage, the resulting DTDs do 
not directly tie the system to any specific legacy data 
source, nor do the definitions preclude the integration of 
other legacy systems in the future. 

DTD Import 

[0069] The tools also allow one to import already ex- 
isting DTD definitions. Such functionality can be used in 
environments where DTDs have already been defined 
for standard document types. These DTDs may have 
been defined by standards bodies or a designer of the 
legacy system. 

DTD generation from Database Schema 

[0070] This part of the tools automatically generates
DTDs from existing database schemas.

XML <-> SQL Mapping Definition

[0071] Given the existence of the DTDs, the system 
100 provides tools that map between legacy back-end 
data formats and XML document formats. In the case of 
relational database access, these mappings link tables, 
columns, and fields from the legacy database to ele- 
ments and attributes of the XML documents as defined 
by the DTDs. This also allows the definition of several 
distinct mappings, each of which involves accessing 



slightly different information in the data source. 

Data Mappings 

Query Mapping

[0072] A query mapping enables an agent to retrieve
information from a legacy data source. In the case of a
relational database, this mapping specifies the contents
of the SELECT statement, including any information
relevant for a table join. A query mapping for a purchase
order may involve accessing a purchase order table, a
customer table, and a product catalog table.

Update Mapping

[0073] An update mapping allows an agent to modify 
information in the data source. This involves specifying 
the contents of an UPDATE statement. An update map- 
ping for a purchase order involves updating the pur- 
chase order table, but not modifying the customer table 
or the product catalog table. 

Delete Mapping 

[0074] A delete mapping allows an agent to delete in- 
formation in the data source. This involves specifying 
the contents of a DELETE statement. A delete mapping 
for a purchase order involves deleting a record or 
records from the purchase order table, but not modifying 
the customer table or the product catalog table. 

Add/Create Mapping 

[0075] An add/create mapping allows an agent to add 
information to the data source. This involves specifying 
the contents of an INSERT statement. An add/create mapping
for a purchase order involves adding a record or
records to the purchase order table, but not modifying 
the customer table or the product catalog table. 
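
The four mapping types can be pictured as one parameterized SQL statement per operation. The following Java sketch is illustrative only: the class name PurchaseOrderMappings and all table and column names (PurchaseOrders, Customers, Products, OrderID, and so on) are assumptions, not names produced by the design tools.

```java
import java.util.EnumMap;
import java.util.Map;

/** Hypothetical per-operation SQL templates for a purchase order document. */
final class PurchaseOrderMappings {
    enum Operation { QUERY, UPDATE, DELETE, ADD }

    // All table and column names below are assumed for illustration.
    private static final Map<Operation, String> SQL = new EnumMap<>(Operation.class);
    static {
        // Query mapping: joins the order with its customer and product rows.
        SQL.put(Operation.QUERY,
            "SELECT o.OrderID, c.Name, p.Description, o.Quantity "
          + "FROM PurchaseOrders o "
          + "JOIN Customers c ON o.CustomerID = c.CustomerID "
          + "JOIN Products p ON o.ProductID = p.ProductID "
          + "WHERE o.OrderID = ?");
        // Update, delete, and add mappings touch only the purchase order table.
        SQL.put(Operation.UPDATE,
            "UPDATE PurchaseOrders SET Quantity = ? WHERE OrderID = ?");
        SQL.put(Operation.DELETE,
            "DELETE FROM PurchaseOrders WHERE OrderID = ?");
        SQL.put(Operation.ADD,
            "INSERT INTO PurchaseOrders (OrderID, CustomerID, ProductID, Quantity) "
          + "VALUES (?, ?, ?, ?)");
    }

    static String sqlFor(Operation operation) {
        return SQL.get(operation);
    }

    public static void main(String[] args) {
        for (Operation operation : Operation.values()) {
            System.out.println(operation + ": " + sqlFor(operation));
        }
    }
}
```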

Schema Extraction and Caching 

[0076] In order to allow for mapping between a legacy 
database schema and XML DTD formats, the mapping 
design tool extracts the schema from legacy databases. 
Because schema extraction is an expensive and time 
consuming task, the tools allow one to save extracted 
schemas on a disk for subsequent use. 

Form Generation 

[0077] The tools will also allow one to automatically 
generate a form from a DTD. Such a form may require 
minor modifications to enhance the physical appear- 
ance of the form. For example, color or font size of text 
can be adjusted to enhance usability.

Embedding Binary Data in XML Documents

[0078] Some enterprise applications may need to retrieve
arbitrary binary data from the data source 111. For
example, a legacy database contains employee information.
Included with that information is a picture of the
employee in standard JPEG format. The employee
information is stored as a single table named "employees,"
which has a schema as in Table 1, where the field <image>
represents the picture:



Table 1

ID    Name          HireDate    Photo
1     John Smith    1/1/96      <image>



[0079] The XML document retrieved from the above table
appears as follows:
<employee>
    <ID>1</ID>
    <name>John Smith</name>
    <hiredate>1996-01-01</hiredate>
</employee>

XML, by itself, does not naturally lend itself to the inclusion
of binary data. To deliver this information for display
in a web page, the service bridge 112 could encode the
SQL record in an XML document as follows:
<employee>
    <ID>1</ID>
    <name>John Smith</name>
    <hiredate>1996-01-01</hiredate>
    <Photo href="http://server/directory/john.jpeg"/>
</employee>

[0080] However, there are a number of problems with
this type of approach. First, it is the responsibility of the
user to issue the proper additional commands to retrieve
the linked document before it can be displayed, e.g., the
user must click on the URL of the picture. Second, the
DTD for the XML document must specify the URL. For
most legacy databases, it is unlikely that the records
storing the binary data are accessible via an HTTP URL.
Furthermore, the binary data is transported through the
system by a follow-on transport, such as HTTP. For
reliability, security, consistency, and other reasons we
prefer to carry all data, including binary data, with the
agents.

[0081] To allow the servlet 123 to generate an agent 
that can access the binary data, we define a new type 
of URL. The new URL incorporates the location of the 
binary data, as well as a unique "name" that can be used 
to retrieve the binary data. The URL contains the host- 
name of the data source, a service name, an action 
name that can be used to perform the retrieval of the 
binary data, and a document identification referring to 
the binary data. This still results in a fairly complex URL.
[0082] Using multiple requests to retrieve the binary 
data is inconsistent with our agent model. Agents try to 



use the network effectively by batching data into fairly 
large self-contained packets. This is very different than 
the hypertext model used on the web in which a single 
page display can lead to multiple network requests. 


Compound Documents 

[0083] In an alternative solution, we define a compound
document. In a compound document, the binary
data is embedded in the same document as the textual
XML data. This approach is consistent with our agent
driven system that attempts to transport data as larger
batches. Compound documents can be built in two
ways.


Embed Binary Data into XML Text Element 

[0084] The binary data is embedded directly into an
XML text element. This can be done as long as the binary
data is encoded in such a way that the data only
contains XML characters. Such an encoding could be
based on the Base64 encoding. Any remaining special
characters, such as "<" and ">," are replaced with equivalent
entities (i.e., &lt; and &gt;). We also can use a character
data (CDATA) section to work around the problem
of illegal characters within the embedded data.
We may want to prefix the embedded binary data with
standard MIME headers that specify content type, encoding,
and name. Such a format for the photo element
appears as follows:
<Photo>
Content-Type: image/jpeg
Content-Encoding: base64
Content-Name: john.jpeg

9j/4AAQSkZJ...gEASABIAAD/
</Photo>

[0085] It should be noted that this alternative increases
the size of the binary data by approximately 33% as well as
increasing the overhead to encode and decode the data.
[0086] This alternative requires that a SQL RAC extract
the binary data, encode the data into Base64,
and then add the encoded data to the XML document
with the proper MIME headers.
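
The embedding step can be sketched in Java with the standard java.util.Base64 class. The class name PhotoEmbedder and its methods are hypothetical, and the header layout simply follows the photo element shown above. Note that the standard Base64 alphabet (A-Z, a-z, 0-9, "+", "/", and "=") contains no "<", ">", or "&", so the encoded payload itself needs no further escaping.

```java
import java.util.Arrays;
import java.util.Base64;

/** Sketch: wrap binary data in a Photo element as Base64 text with MIME-style headers. */
final class PhotoEmbedder {
    static String embed(byte[] data, String contentType, String name) {
        String encoded = Base64.getEncoder().encodeToString(data);
        return "<Photo>\n"
             + "Content-Type: " + contentType + "\n"
             + "Content-Encoding: base64\n"
             + "Content-Name: " + name + "\n"
             + "\n"                                   // blank line ends the headers
             + encoded + "\n"
             + "</Photo>";
    }

    /** Recovers the original bytes from an element produced by embed(). */
    static byte[] extract(String photoElement) {
        int start = photoElement.indexOf("\n\n") + 2;
        int end = photoElement.lastIndexOf("\n</Photo>");
        return Base64.getDecoder().decode(photoElement.substring(start, end));
    }

    public static void main(String[] args) {
        // Stand-in bytes, including the values of '<' (0x3C) and '>' (0x3E).
        byte[] data = {(byte) 0xFF, (byte) 0xD8, (byte) 0xFF, (byte) 0xE0, 0x3C, 0x3E};
        String element = embed(data, "image/jpeg", "john.jpeg");
        System.out.println(element);
        System.out.println("round trip ok: " + Arrays.equals(data, extract(element)));
    }
}
```

Every three input bytes become four output characters, which is where the size increase of roughly one third comes from.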

Compound Document Encoded as Mime Document 

[0087] Another alternative embeds both the XML
document and the binary data into separate parts of a
multipart MIME document. Each part of the overall document
has a Content-ID which is referenced from a
standard XML link. In part, such a format appears as
follows:

Content-Type: multipart/related; boundary="XXXXX"

--XXXXX
Content-Type: text/xml
Content-ID: doc

<Photo href="cid:photo"/>

--XXXXX
Content-Type: image/jpeg
Content-Encoding: base64
Content-Name: john.jpeg
Content-ID: photo

9j/4AAQSkZJ...gEASABIAAD/

--XXXXX--

[0088] With this alternative, the binary data may not 
need to be encoded. However, this requires that agents 
also retrieve MIME documents via the RAC. 
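
Assembling such a compound document can be sketched as plain string construction in Java. The class name CompoundDocument is hypothetical, the header names follow the example above rather than any particular MIME library, and the binary part is Base64-encoded here only so that the result stays printable.

```java
import java.util.Base64;

/** Sketch: assemble a two-part multipart/related compound document as text. */
final class CompoundDocument {
    static String build(String boundary, String xml, byte[] binary,
                        String name, String contentType) {
        String delimiter = "--" + boundary;
        return "Content-Type: multipart/related; boundary=\"" + boundary + "\"\n"
             + "\n"
             // Part 1: the XML document, which links to the binary part by Content-ID.
             + delimiter + "\n"
             + "Content-Type: text/xml\n"
             + "Content-ID: doc\n"
             + "\n"
             + xml + "\n"
             + "\n"
             // Part 2: the binary data.
             + delimiter + "\n"
             + "Content-Type: " + contentType + "\n"
             + "Content-Encoding: base64\n"
             + "Content-Name: " + name + "\n"
             + "Content-ID: photo\n"
             + "\n"
             + Base64.getEncoder().encodeToString(binary) + "\n"
             + "\n"
             // The closing delimiter carries two trailing hyphens.
             + delimiter + "--\n";
    }

    public static void main(String[] args) {
        byte[] data = {1, 2, 3};  // stand-in for the JPEG bytes
        System.out.print(build("XXXXX", "<Photo href=\"cid:photo\"/>",
                               data, "john.jpeg", "image/jpeg"));
    }
}
```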

JDBC Service Bridge 

[0089] Figure 4 shows details of a preferred embodi- 
ment of a service bridge 400 of the back-end interface 
110 for accessing a data source. In this embodiment, 
JDBC is used to access a SQL type of database. The 
bridge 400 includes a public interface 410, JDBC run- 
time access component (RAC) 420, XML-SQL data 
mapping 430, and a document cache 440 as its main 
components. 

Public Interface 

[0090] As stated above, the public interface 410 provides
the means by which agents access the data sources
111. The public interface allows data retrieval, modification,
and addition. As an advantage, the public interface
410 makes no assumptions about how data in
the legacy database 111 is sourced or maintained. Instead,
we make the public interface resemble the GET/PUT
model of HTTP.

JDBC Run-Time Access Component 

[0091] The JDBC access component 420 is respon- 
sible for establishing and managing JDBC connections, 
building and executing SQL statements, and traversing 
result sets. This component works entirely within the 
context of JDBC and SQL. 

XML-SQL Data Mapping 

[0092] The XML-SQL data mapping 430 uses the 
mapping information generated by the design tools 140
to map data between XML and SQL. 

Document Cache 

[0093] The document cache 440 operates entirely 
with XML documents. XML documents that have been 
retrieved from the data source can be cached for fast 
future retrieval. The caching services are configurable 
so that maximum cache sizes and cache item expiration 
times can be specified. Caching can be disabled for cer- 
tain classes of documents which contain highly volatile 
information. 

[0094] Figure 5 shows the public interface 410 in
greater detail. The interface supports four basic types
of accesses, namely get 510, put 520, add 530, and delete
540.

[0095] At the heart of the interface is the document id
104. The document id is a string which uniquely identifies
every document instance within the data source.
The document id can be thought of as corresponding to
the URL of a World Wide Web document, or to the primary
key of a record in a database. Although the id has
a different format than a URL, it does serve as a document
locator.

[0096] In order to interact with information in the legacy
data source, an agent needs to provide the id for
the document containing the information. The id contains
multiple sections of information and follows the
following pattern.

[0097] The first character of the id string specifies a
separator character (S) 501 that is used to separate the
different sections that make up the document id, e.g., a
colon (:). This character is used in conjunction with a
Java StringTokenizer to parse the document id. The
subsequent information in the id includes name=value
pairs (N, V) 502. One pair 502 specifies a document
type, e.g., ":type=cust_list:"
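
A minimal sketch of this parsing step, using java.util.StringTokenizer as described, might look as follows. The class name DocumentId and the choice of a map as the result type are assumptions made for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringTokenizer;

/** Sketch: split a document id into its name=value sections. */
final class DocumentId {
    static Map<String, String> parse(String id) {
        if (id == null || id.length() < 2) {
            throw new IllegalArgumentException("document id is too short: " + id);
        }
        // The first character declares the separator used for the rest of the id.
        String separator = id.substring(0, 1);
        Map<String, String> pairs = new LinkedHashMap<>();
        // StringTokenizer skips empty tokens, so a trailing separator is harmless.
        StringTokenizer sections = new StringTokenizer(id.substring(1), separator);
        while (sections.hasMoreTokens()) {
            String section = sections.nextToken();
            int equals = section.indexOf('=');
            if (equals <= 0) {
                throw new IllegalArgumentException("expected name=value, got: " + section);
            }
            pairs.put(section.substring(0, equals), section.substring(equals + 1));
        }
        return pairs;
    }

    public static void main(String[] args) {
        System.out.println(parse(":type=cust_list:"));   // prints {type=cust_list}
        System.out.println(parse("|type=cust_list|"));   // same result with another separator
    }
}
```

Because the separator is declared by the id itself, a value may contain a colon as long as the id starts with a different separator character.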

[0098] In most common cases, the id 104 also contains
a key specifying the exact document instance in
order to uniquely identify an individual document in a
data source. For example, in a document containing
customer information, this key contains a data source
specific customer number or a customer id. Within the
service bridge, this key is mapped to a WHERE clause
of a SQL statement. For example, an agent can request
customer information for a particular customer by
specifying an id string as follows:

":type=customer:key=SMITH:".
This request results in a SQL query to the database that
appears as follows:

SELECT * FROM Customers WHERE Customers.ID = 'SMITH'

The exact semantics of how the key is mapped into the
resultant SQL statement is specified by the design tools
140.

[0099] The key portion of the id can be composed of
multiple pieces of information separated by, for example,
commas. Such a key is used in cases in which the
WHERE clause of the corresponding SQL query needs
multiple pieces of information to be specified by the
agent. An example of this is a document containing a
list of customers, where the customers' names are within
a certain alphabetic range, for example, "all customers
whose last names begin with the letters A or B." Such a
document has an id as follows:

":type=cust_list_by_name:key=A,Bzzzz:"
In this case, the request would map into a SQL statement
resembling the following:

SELECT * FROM Customers
WHERE Customers.LastName BETWEEN 'A' AND 'Bzzzz'
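
The mapping from the key portion of an id to a WHERE clause can be sketched as follows. The class name KeyMapper and the hard-coded templates are assumptions; in the system described here the templates would come from the design tools. The sketch produces a statement with "?" placeholders and a list of values to bind to them (for example with java.sql.PreparedStatement.setString), which avoids quoting the key values by hand.

```java
import java.util.Arrays;
import java.util.List;

/** Sketch: turn the key section of a document id into a parameterized statement. */
final class KeyMapper {
    /** A SQL string with '?' placeholders plus the values to bind to them. */
    static final class Statement {
        final String sql;
        final List<String> parameters;

        Statement(String sql, List<String> parameters) {
            this.sql = sql;
            this.parameters = parameters;
        }
    }

    // The templates stand in for what the design tools would define per document type.
    static Statement map(String type, String key) {
        List<String> parts = Arrays.asList(key.split(","));
        switch (type) {
            case "customer":
                expect(parts, 1);
                return new Statement(
                    "SELECT * FROM Customers WHERE Customers.ID = ?", parts);
            case "cust_list_by_name":
                expect(parts, 2);
                return new Statement(
                    "SELECT * FROM Customers WHERE Customers.LastName BETWEEN ? AND ?", parts);
            default:
                throw new IllegalArgumentException("no mapping for document type: " + type);
        }
    }

    private static void expect(List<String> parts, int count) {
        if (parts.size() != count) {
            throw new IllegalArgumentException(
                "expected " + count + " key part(s), got " + parts);
        }
    }

    public static void main(String[] args) {
        Statement s = map("cust_list_by_name", "A,Bzzzz");
        System.out.println(s.sql + " with " + s.parameters);
    }
}
```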
Implementation Details of the Service Bridge
Database Access

User Authentication

[0100] The service bridge is responsible for performing
any authentication necessary in order to establish a
database connection. This may involve supplying a
database specific username and password or other login
information. When a database access (get, put, add, delete)
is made by an agent, the bridge examines the
agent's runtime context to determine the user identity
associated with the agent.

[0101] After the agent's identity has been ascertained,
the service bridge maps the identity into a corresponding
database-specific user identification using a
mapping table generated by the design tools. For example,
the mapping maps the user identity "steve@accounting"
into an Oracle username "steve."
[0102] In order to establish a connection to a database
on behalf of a user, the service bridge retrieves
both the username and clear-text password for the
corresponding database user account. In such cases, the
clear-text password is stored in the identity-mapping table.
For security reasons, the table is encrypted on disk
using a public/private key pair.

Connection Management 


[0103] To enhance performance and scalability, the
service bridge supports database connection pools.
This means that multiple users share a common pool of
JDBC connections. Establishing a database connection
can be a slow and relatively expensive operation. The
use of shared connection pools decreases this expense.
[0104] The basis for this connection sharing is "users
groups." When an agent attempts an operation
which requires a connection to a database, the service
bridge performs that operation using a connection
established in the context of a special "pseudo-user" account.
The pseudo-user is a database system account
that represents not an individual user, but instead a
particular group of users. A pool of such pseudo-user
connections is available for use by all of the agents of the
group. The service bridge generates and maintains a
connection pool for each distinct group of users who
access the bridge.

[0105] Figure 6 shows agents 101 for three users tom,
joe and david 601-603 accessing the data source 111.
Two of the users, tom@users and joe@users, are members
of a "users" group. The third user, david@managers,
is a member of a "managers" group. When these agents
attempt to access the database, the two members of the
users group share a connection pool 610 that was
established with the credentials of the "users" pseudo-user.
The third agent will communicate with the database
using a separate connection pool 620 established with
the credentials of the "managers" pseudo-user.
[0106] A connection pool for a particular group is generated
when a member of the group makes the first access
request. Connections within the pool are constructed
as needed. The service bridge does not pre-allocate
connections. After a configurable, and perhaps long, period
of inactivity, the connection pool is closed to free
database resources. If a connection pool for a particular 
group has been closed due to inactivity, then any sub- 
sequent request by a member of that group results in 
the generation of a new pool. When a request is com- 
pleted, the connection allocated for that request is re- 
turned to the pool. A maximum number of connections 
in a pool can be specified. If no connections are availa- 
ble when a request is made, then the request is blocked 
until a connection becomes available. 
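
The pooling behavior described above (one pool per group, created on first use, connections built as needed, a maximum size, and blocking when the pool is exhausted) can be sketched as follows. The class name GroupPools is hypothetical, the type parameter C stands in for a JDBC connection so that the sketch runs without a database, and closing idle pools after a period of inactivity is omitted for brevity.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Semaphore;
import java.util.function.Function;

/** Sketch of per-group connection pools; C stands in for a JDBC Connection. */
final class GroupPools<C> {
    private final Function<String, C> connect;  // opens a connection as the group's pseudo-user
    private final int maxPerPool;
    private final Map<String, Pool<C>> pools = new ConcurrentHashMap<>();

    GroupPools(Function<String, C> connect, int maxPerPool) {
        this.connect = connect;
        this.maxPerPool = maxPerPool;
    }

    /** The group is the part after '@', e.g. "tom@users" belongs to "users". */
    static String groupOf(String identity) {
        return identity.substring(identity.indexOf('@') + 1);
    }

    /** Borrows a connection, blocking while all of the group's connections are in use. */
    C acquire(String identity) {
        String group = groupOf(identity);
        // The pool itself is created on the first request from the group.
        Pool<C> pool = pools.computeIfAbsent(group, g -> new Pool<>(maxPerPool));
        pool.permits.acquireUninterruptibly();
        C idle = pool.takeIdle();
        if (idle != null) {
            return idle;
        }
        try {
            return connect.apply(group);  // constructed as needed, never pre-allocated
        } catch (RuntimeException e) {
            pool.permits.release();       // do not leak the permit if connecting fails
            throw e;
        }
    }

    /** Returns a borrowed connection to its group's pool. */
    void release(String identity, C connection) {
        Pool<C> pool = pools.get(groupOf(identity));
        pool.putIdle(connection);
        pool.permits.release();
    }

    private static final class Pool<C> {
        final Semaphore permits;
        private final Deque<C> idle = new ArrayDeque<>();

        Pool(int max) {
            permits = new Semaphore(max);
        }

        synchronized C takeIdle() {
            return idle.pollFirst();
        }

        synchronized void putIdle(C connection) {
            idle.addFirst(connection);
        }
    }

    public static void main(String[] args) {
        GroupPools<String> pools = new GroupPools<>(group -> "connection-as-" + group, 2);
        String c1 = pools.acquire("tom@users");
        String c2 = pools.acquire("david@managers");
        System.out.println(c1 + ", " + c2);
        pools.release("tom@users", c1);
        pools.release("david@managers", c2);
    }
}
```

The semaphore bounds the number of connections handed out per group, so a request beyond the maximum waits until another agent releases a connection.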

Statement Construction and Execution 

[0107] The actual generation and execution of SQL 
statements is performed by a separate "modeler" object. 
The modeler object is generated by the design tools 
140. For each type of document used in the system, 
there is a distinct modeler object. Each modeler knows 
how to construct exactly one type of document. During 
the design process, one specifies what information is to 
be retrieved from the database, and how to map the in- 
formation into an XML document. The design tools se- 
rialize and save the modeler objects in a ".ser" file. At 
runtime, the service bridge loads and de-serializes the 
modeler objects from the ".ser" file. The resultant mod- 
eler objects are able to perform all of the data access 
and mapping functions required to retrieve information 
from the data sources. As stated above, SQL to XML 
data mapping is performed by the modeler object de- 
signed for a particular document type. 
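
Saving and loading a modeler through a ".ser" file can be sketched with standard Java object serialization. The Modeler class shown here, with only a document type and a SELECT statement, is a deliberately simplified stand-in and not the structure actually produced by the design tools.

```java
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.nio.file.Files;
import java.nio.file.Path;

/** Sketch: the design tools save a modeler, the service bridge loads it at runtime. */
final class ModelerStore {
    /** Hypothetical, simplified modeler for exactly one document type. */
    static final class Modeler implements Serializable {
        private static final long serialVersionUID = 1L;
        final String documentType;
        final String selectSql;

        Modeler(String documentType, String selectSql) {
            this.documentType = documentType;
            this.selectSql = selectSql;
        }
    }

    /** Design time: serialize the modeler into a ".ser" file. */
    static void save(Modeler modeler, Path file) throws IOException {
        try (ObjectOutputStream out = new ObjectOutputStream(Files.newOutputStream(file))) {
            out.writeObject(modeler);
        }
    }

    /** Runtime: de-serialize the modeler from the ".ser" file. */
    static Modeler load(Path file) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(Files.newInputStream(file))) {
            return (Modeler) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Path file = Files.createTempFile("customer", ".ser");
        save(new Modeler("customer", "SELECT * FROM Customers WHERE Customers.ID = ?"), file);
        Modeler loaded = load(file);
        System.out.println(loaded.documentType + " -> " + loaded.selectSql);
        Files.delete(file);
    }
}
```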

Data Caching 

[0108] To improve the performance of document re- 
trieval, the data service caches database information as 
converted XML documents. When a first request is 
made to retrieve a document, the service performs the 
SQL access and SQL to XML data mapping as de- 
scribed above. The resultant XML document is added 
to the cache of documents 440 maintained by the serv- 
ice bridge. Any subsequent request to retrieve the doc- 
ument will be satisfied by retrieving the document from 
the cache, bypassing the need for an additional expen- 
sive database access and mapping. 
[0109] When an update or addition is made to a data
source, the cache is updated to reflect the new information.
The update to the cache is made only after the SQL
statement performing the update of the end database
has been completed successfully. This prevents the
cache from storing information that has not been committed
to the database due to errors or to security restrictions.
[0110] The XML document cache is configurable to
specify a maximum size of the cache, the maximum 
amount of time a single document can be retained in the 
cache before it becomes stale, and whether the cache 
should be persisted to disk, in which case the cache can 
be re-used after a server restart. One can also custom- 
ize how different classes of documents are cached. If a 
document represents highly volatile information, then 
caching can be disabled for that class of document. If a 
document class is completely (or virtually) static, then 
documents of that class can be cached for a very long 
time. 
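
A cache with a maximum size and a maximum document age can be sketched as follows. The class name DocumentCache is hypothetical, the maximum size is counted in documents (an assumption, since the size unit is not specified here), a maximum of zero is used to model a document class for which caching is disabled, and persistence to disk is left out.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.LongSupplier;

/** Sketch: XML document cache with a size limit and an expiration time. */
final class DocumentCache {
    private static final class CachedDoc {
        final String xml;
        final long storedAt;

        CachedDoc(String xml, long storedAt) {
            this.xml = xml;
            this.storedAt = storedAt;
        }
    }

    private final int maxDocuments;
    private final long maxAgeMillis;
    private final LongSupplier clock;  // injectable so that expiry can be tested
    private final LinkedHashMap<String, CachedDoc> documents;

    DocumentCache(int maxDocuments, long maxAgeMillis, LongSupplier clock) {
        this.maxDocuments = maxDocuments;
        this.maxAgeMillis = maxAgeMillis;
        this.clock = clock;
        // Access-ordered map: the eldest entry is the least recently used one.
        this.documents = new LinkedHashMap<String, CachedDoc>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, CachedDoc> eldest) {
                return size() > DocumentCache.this.maxDocuments;
            }
        };
    }

    synchronized void put(String id, String xml) {
        if (maxDocuments <= 0) {
            return;  // caching disabled, e.g. for highly volatile documents
        }
        documents.put(id, new CachedDoc(xml, clock.getAsLong()));
    }

    /** Returns the cached XML, or null if it is absent or has become stale. */
    synchronized String get(String id) {
        CachedDoc doc = documents.get(id);
        if (doc == null) {
            return null;
        }
        if (clock.getAsLong() - doc.storedAt > maxAgeMillis) {
            documents.remove(id);
            return null;
        }
        return doc.xml;
    }

    public static void main(String[] args) {
        DocumentCache cache = new DocumentCache(100, 60_000, System::currentTimeMillis);
        cache.put(":type=customer:key=SMITH:", "<customer/>");
        System.out.println(cache.get(":type=customer:key=SMITH:"));
    }
}
```

A virtually static document class would simply be given a very large maximum age.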

Execution Flow 

[0111] The following section describes the execution 
flow for basic database access requests. Figure 7 
shows the steps 700 of a "get" or retrieval access in 
greater detail. After the request is received from the
agent 710, the caller and document identity are determined
720, 730. The group specific cache is identified
740, and the cache is checked 750. If the cache stores
the document, the document is returned in step 755. Otherwise,
the XML-SQL mapping is located 760, the SQL SELECT
statement is constructed 770, the connection is retrieved
775, and the statement is executed in step 780. Next,
the result set is "walked" 785, fields are extracted 790 
to build the XML document 794, the document is cached 
796 and returned to the agent in step 798. Figure 8 
shows the steps 800 for the addition (add) and modifi- 
cation (put) similar to the get steps. The delete request 
simply deletes data from the database as shown at 540 
in Figure 5. 
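
The control flow of the "get" access can be summarized in a short Java sketch. The class name GetFlow is hypothetical, and the database steps 760 through 794 are collapsed into a single function supplied by the caller, so the sketch shows only the order of the cache check, the database access, and the cache update.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Sketch of the "get" flow: check the group's cache first, otherwise query and map. */
final class GetFlow {
    private final Map<String, Map<String, String>> cachesByGroup = new HashMap<>();
    private final Function<String, String> queryAndMap;  // stands in for steps 760 to 794
    private int databaseAccesses = 0;

    GetFlow(Function<String, String> queryAndMap) {
        this.queryAndMap = queryAndMap;
    }

    synchronized String get(String callerGroup, String documentId) {
        // Step 740: identify the group specific cache.
        Map<String, String> cache =
            cachesByGroup.computeIfAbsent(callerGroup, g -> new HashMap<>());
        // Steps 750 and 755: return the cached document if there is one.
        String cached = cache.get(documentId);
        if (cached != null) {
            return cached;
        }
        // Steps 760 to 794: locate the mapping, run the query, build the XML.
        databaseAccesses++;
        String xml = queryAndMap.apply(documentId);
        // Steps 796 and 798: cache the document and return it.
        cache.put(documentId, xml);
        return xml;
    }

    synchronized int databaseAccesses() {
        return databaseAccesses;
    }

    public static void main(String[] args) {
        GetFlow flow = new GetFlow(id -> "<document id=\"" + id + "\"/>");
        flow.get("users", ":type=customer:key=SMITH:");
        flow.get("users", ":type=customer:key=SMITH:");  // served from the cache
        System.out.println("database accesses: " + flow.databaseAccesses());
    }
}
```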

Run-time Object Hierarchy 

[0112] Figure 9 shows the run-time hierarchy 900 of
objects of the service bridge 110. The objects can be
classified as data source independent 901, and data
source dependent 902. The data source independent
objects 901 include a data source factory object 910
indexed by group name, group specific data source objects
920, document factory objects 930 (one per document),
document cache objects 940, document builder
objects 950, connection pool objects 960, mapping table
objects 970, document manager objects 980, and the
data source manager objects 990. The data source dependent
objects 902 include source connection 991,
string authentication 992, document map 993, and specific
driver objects 994.

[0113] Although the invention has been described by 
way of examples of preferred embodiments, it is to be 
understood that various other adaptations and modifi- 
cations may be made within the spirit and scope of the 
invention. Therefore, it is the object of the appended 
claims to cover all such variations and modifications as 
come within the true spirit and scope of the invention. 



Claims 

1. An enterprise integration system coupled to a plurality
of data sources, the plurality of data sources
using different data formats and different access
methods, comprising:

a back-end interface configured to convert input
data source information to input XML documents
and to convert output XML documents
to output data source information;
a front-end interface including means for converting
the output XML documents to output
HTML forms and for converting input HTML
forms to the XML documents;
a middle tier including a rules engine and a
rules database;
design tools for defining the conversion and the
XML documents;
a network coupling the back-end interface, the
front-end interface, the middle tier, the design
tools, and the data sources;
a plurality of agents configured to communicate
the XML documents over the network and to
process the XML documents according to the
rules.

2. The system of claim 1 wherein the back-end interface
further comprises:

a public interface;
a document cache; and
a run-time access component.

3. The system of claim 2 wherein the public interface
forwards the input XML document to the plurality of
the agents for distribution, and the public interface receives
the output XML documents for storing in the plurality
of data sources.

4. The system of claim 2 wherein the document cache
includes caching parameters.

5. The system of claim 2 wherein the caching parameters
include a maximum lifetime for each cache entry,
a maximum cache size, and a persistency indicator.

6. The system of claim 2 wherein the run-time access 
component generates access requests for the plu- 
rality of data sources. 

7. The system of claim 6 wherein the access requests 
include query, update, delete, add, browse, and 
search. 

8. The system of claim 1 wherein the XML documents
include binary data.

9. The system of claim 8 wherein the binary data is
referenced by a Universal Resource Locator.

10. The system of claim 8 wherein the binary data is
embedded as a compound document.

11. The system of claim 10 wherein the compound document
embeds the binary data as an encoding in a
character set.

12. The system of claim 10 wherein the compound document
embeds the binary as a MIME document.

13. The system of claim 1 wherein the input documents
are presented to a browser.

14. The system of claim 1 wherein each XML document
is identified by a document identification.

15. The system of claim 14 wherein the document identification
is a character string.

16. The system of claim 15 wherein the character string
includes a plurality of sections, and a first character
of the string is a section separator.

17. The system of claim 16 wherein one of the sections
stores a document type.

18. The system of claim 15 wherein one of the sections
stores a key to an instance of the XML document in
one of the data sources.

19. The system of claim 1 wherein the back-end interface
performs user authentication.

20. The system of claim 1 wherein the back-end interface
supports database connection pools.

21. A method for integrating a plurality of data sources,
the plurality of data sources using different data formats
and different access methods, comprising:

converting input data source information to input
XML documents and converting output
XML documents to output data source information;
converting the input XML documents to input
HTML forms and converting output HTML
forms to the output XML documents;
providing a rules engine and a rules database;
defining the converting and the XML documents;
communicating the XML documents over a network
using agents; and
processing the XML documents by the agents
according to the rules database.
FIG. 5

/**
 * Get a document from a data source
 * @param id   The id of the document. The id should
 *             at least contain the document class and unique
 *             document id. The id may also contain information
 *             specific to the back end data source such as further
 *             processing instructions or identification information.
 * @return     A DOM Document object containing the XML data
 */
public Document get (String id);                  // 510

/**
 * Update an existing document in the data source
 * @param id   The id of the document. The id should
 *             at least contain the document class and unique
 *             document id. The id may also contain information
 *             specific to the back end data source such as further
 *             processing instructions or identification information.
 * @param doc  The new document to commit to the data source.
 */
public void put (String id, Document update);     // 520

/**
 * Add a new document to the data source.
 * @param id   A partial id for the document. The id should
 *             contain the document class. A unique document id
 *             will be generated for the document and returned by the
 *             method. The id may also contain information
 *             specific to the back end data source such as further
 */
public String add (String id, Document doc);      // 530

/**
 * Delete a document
 * @param id   The id of the document. The id should
 *             at least contain the document class and unique
 *             document id. The id may also contain information
 *             specific to the back end data source such as further
 *             processing instructions or identification information.
 */
public void delete (String id);                   // 540